Skip to content

spike(stark): symbolic-field capture of transition constraints → flat IR (CPU)#737

Draft
MauroToscano wants to merge 5 commits into
mainfrom
spike/constraint-ir-symfield
Draft

spike(stark): symbolic-field capture of transition constraints → flat IR (CPU)#737
MauroToscano wants to merge 5 commits into
mainfrom
spike/constraint-ir-symfield

Conversation

@MauroToscano

Copy link
Copy Markdown
Contributor

What

A CPU-only, minimal spike for capturing algebraic transition constraints into a flat,
single-field Goldilocks IR and interpreting it — the first step toward evaluating
constraints on the GPU.

The motivation is not to speed up constraint evaluation (it isn't the CPU bottleneck).
It's data residency: once LDE/Merkle/FRI run on the GPU, leaving constraint evaluation
on the CPU forces a round-trip of the (large) LDE trace across the bus, which dominates. To
keep the pipeline on-device we need the constraint logic in a GPU-executable form — which
means capturing it into an IR a kernel can interpret (the approach SP1, OpenVM, and zisk all
use; OpenVM is our closest match — FRI-STARK / LDE quotient).

This PR proves the capture half works, on the CPU, without touching any constraint body.

What's in it

New module crypto/stark/src/symbolic/:

  • sym_field.rsSymField/SymExt recording fields: zero-size markers implementing
    IsField (+ IsSubFieldOf<SymExt>) whose ops record nodes into a thread-local
    hash-consing arena instead of computing.
  • ir.rs — the IR: a typed Dim1/Dim3 op-DAG (Add/Sub/Mul/Neg/Embed + Const/Var leaves).
  • capture.rs — runs a constraint's existing evaluate::<SymField, SymExt> and snapshots
    the recorded arena into a ConstraintProgram. Zero edits to constraint bodies.
  • interp.rs — a CPU interpreter: forward pass over the nodes reusing real FieldElement
    arithmetic (so per-op results are bit-identical to the boxed path).

Test (prover/src/tests/symbolic_ir_tests.rs): captures IsBit (cond + uncond),
AddConstraint (both carries), and ProductZero, then asserts the IR interpreter matches
the real evaluate bit-for-bit over 1000 random rows each.

Results

  • stark and lambda-vm-prover build; cargo fmt/clippy clean.
  • ✅ Diff test passes 4/4 — including AddConstraint, which exercises from(i64)
    coefficients and the inv_2_32 constant.
  • Feasibility confirmed: SymField needs only IsField + IsSubFieldOf<SymExt>
    not IsFFTField/IsPrimeField — because capture never instantiates an
    AIR<Field=SymField>. The methods that can't be symbolic (inv, div, real
    ByteConversion) are provably unreachable and left as unimplemented!(); eq returns
    false so runtime zero-skip optimizations aren't taken during capture.
  • Per-constraint node counts: 66–78 — but that includes 64 always-recorded column leaves
    (a spike artifact of building all NUM_COLS leaves); the real per-constraint logic is a
    handful of nodes (e.g. product_zero = 64 leaves + zero + 1 Mul). A leaf-on-demand /
    DCE pass removes the padding.

Out of scope (follow-ons)

  • The GPU interpreter kernel (mirroring OpenVM's quotient.cu + transpiler/codec —
    three-address code, register allocation, fused Σ αⁱ·Cᵢ·Zᵢ⁻¹ accumulation on-device).
  • LogUp framework constraints (LookupBatchedTerm/LookupAccumulated) — capturable the
    same way (their inner fns are already field-generic) but need the challenge/alpha-power
    context; left out of the minimal set.
  • Wiring the interpreter into ConstraintEvaluator::evaluate_transitions.

Notes

Design docs (motivation, the two capture approaches considered, OpenVM/SP1/zisk findings,
recommendation) are in thoughts/gpu-constraint-eval/. This is a draft spike, not
intended to merge as-is — it de-risks the SymField-as-IsField question and the
capture→IR→interpret equivalence before investing in the GPU kernel.

… flat IR

CPU-only proof-of-concept toward evaluating constraints on the GPU (to keep the prove pipeline data-resident). Adds crypto/stark/src/symbolic/: SymField/SymExt recording fields (impl IsField + IsSubFieldOf via a thread-local hash-consing arena), a typed Dim1/Dim3 op-DAG IR, a capture front-end, and a CPU interpreter.

Reuses constraint bodies unchanged: running evaluate::<SymField,SymExt> records the IR. SymField needs only IsField + IsSubFieldOf (capture never builds an AIR<Field=SymField>); inv/div/ByteConversion are unreachable stubs and eq returns false.

Diff test (prover) captures IsBit/Add/ProductZero and asserts the IR interpreter matches the real evaluate bit-for-bit over 1000 random rows. Out of scope (follow-ons): GPU kernel, LogUp constraints, wiring into evaluate_transitions. Design notes in thoughts/gpu-constraint-eval/.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant